Nucleotide and deduced amino acid sequences of hypervariable region 1 of the envelope 2 gene of isolates of hepatitis C virus and the use of reagents derived from these hypervariable sequences in diagnostic methods and vaccines

ABSTRACT

The nucleotide and deduced amino acid sequences of hypervariable region 1 of the envelope 2 gene of 49 isolates of hepatitis C virus are disclosed. The invention relates to the use of these sequences to design proteins and nucleic acid sequences useful in diagnostic methods and vaccines.

FIELD OF INVENTION

The present invention is in the field of hepatitis virology. The invention relates to the nucleotide and deduced amino acid sequences of hypervariable region 1 of the envelope 2 (E2) gene of hepatitis C virus (HCV) isolates from around the world and the grouping of these hypervariable sequences into distinct HCV genotypes. More specifically, this invention relates to diagnostic methods and vaccines which employ nucleic acid sequences and recombinant or synthetic proteins derived from these hypervariable sequences.

BACKGROUND OF INVENTION

Hepatitis C, originally called non-A, non-B hepatitis, was first described in 1975 as a disease serologically distinct from hepatitis A and hepatitis B (Feinstone, S. M. et al. (1975) N. Engl. J. Med., 292:767-770). Although hepatitis C was (and is) the leading type of transfusion-associated hepatitis as well as an important part of community-acquired hepatitis, little progress was made in understanding the disease until the recent identification of hepatitis C virus (HCV) as the causative agent of hepatitis C via the cloning and sequencing of the HCV genome (Choo, Q.-L. et al. (1989) Science, 244:359-362). The sequence information generated by this study resulted in the characterization of HCV as a small, enveloped, positive-stranded RNA virus and led to the demonstration that HCV is a major cause of both acute and chronic hepatitis worldwide (Weiner, A. J. et al. (1990) Lancet, 335:1-3). Subsequently, it has been observed that approximately 80% of individuals acutely infected with HCV become chronically infected and more than 20% of these individuals eventually develop liver cirrhosis (Alter, H. J. and Seeff, L. B., Transfusion Associated Hepatitis, in: Zuckerman, A. J. and Thomas, H. C. (eds), Viral Hepatitis: Scientific Basis and Clinical Management, Edinburgh: Churchill Livingstone, 1993). In addition, a strong association has been found between HCV infection and the development of hepatocellular carcinoma (Bukh et al. (1993) Proc. Natl. Acad. Sci. USA, 90:1848-1851), and HCV infection also seems to be associated with other diseases, including some autoimmune diseases (Manns, M. P. (1993) Intervirol., 35:108-115; Lionel, F. (1994) Gastroenterology, 107:1550-1555). Thus, HCV infection causes significant morbidity and mortality worldwide, and vaccine development is a high priority.

Choo et al. ((1994) Proc. Natl. Acad. Sci. USA, 91:1294-1298), using recombinant E1 and E2 proteins of HCV-1 as immunogens, reported the successful vaccination of chimpanzees against challenge with 10 CID₅₀ of the homologous strain of HCV. However, Choo et al. did not demonstrate protection against challenge with a heterologous strain of HCV, and the recent discovery of the extraordinary diversity of HCV genomes based on sequence analysis of numerous HCV isolates (Bukh et al. (1993) Proc. Natl. Acad. Sci. USA, 90:8234-8238; Bukh et al. (1994) Proc. Natl. Acad. Sci. USA, 91:8239-8243) suggests that a successful vaccine must protect against challenge by multiple strains of HCV. In addition, both Farci et al. (Farci, P. et al. (1992) Science, 258:135-140) and Prince et al. (Prince, A. M. et al. (1992) J. Infect. Dis., 165:438-443) have presented evidence that while infection with one strain of HCV does modify the degree of the hepatitis C associated with the reinfection, it does not protect against reinfection with a closely related strain.

One possible candidate for use as an immunogen in a vaccine protective against multiple strains of HCV is a short region within the E2 gene termed hypervariable region 1 (HVR1) that has many similarities to the V3 loop of HIV, which represents the principal neutralizing domain of HIV (Letvin, N. L. (1993) N. Engl. J. Med., 329:1400). Indeed, the recent demonstration that antibodies specific to HVR1 can neutralize HCV in an in vitro binding assay (Zibert, A. et al. (1995) Virology, 208:653-661) suggests that HVR1 may be a principal neutralization determinant of HCV. Thus, the identification of HVR1 sequences from multiple HCV isolates of different genotypes may be useful in developing an immunogen capable of stimulating a protective immune response against challenge with HCV isolates.

SUMMARY OF INVENTION

The present invention relates to the nucleotide and deduced amino acid sequences of hypervariable region 1 (HVR1) of the envelope 2 (E2) gene of 49 human hepatitis C virus (HCV) isolates.

The invention also relates to proteins derived from the hypervariable sequences disclosed herein. These proteins may be synthesized chemically or may be produced recombinantly by inserting hypervariable nucleic acid sequences into an expression vector and expressing the recombinant protein in a host cell.

The invention further relates to the use of these proteins, either alone or in combination with each other, as diagnostic agents and as vaccines.

The invention further relates to the use of expression vectors containing the hypervariable nucleic acid sequences of the present invention as nucleic acid based vaccines.

This invention therefore relates to pharmaceutical compositions useful in prevention or treatment of hepatitis C in a mammal.

The invention also relates to the use of single-stranded antisense poly- or oligonucleotides derived from HVR1 nucleic acid sequences to inhibit expression of hepatitis C E2 genes.

The invention further relates to multiple computer-generated alignments of the nucleotide and deduced amino acid sequences of the HVR1 sequences. These multiple sequence alignments produce consensus sequences which serve to highlight regions of homology and non-homology between sequences found within the same genotype or in different genotypes and hence, these alignments can be used by those of ordinary skill in the art to design proteins and nucleic acid sequences useful as reagents in diagnostic assays and vaccines.

The present invention also encompasses methods of detecting antibodies specific for hepatitis C virus in biological samples. The methods of detecting HCV or antibodies to HCV disclosed in the present invention are useful for diagnosis of infection and disease caused by HCV and for monitoring the progression of such disease. Such methods are also useful for monitoring the efficacy of therapeutic agents during the course of treatment of HCV infection and disease in a mammal.

The invention also provides a kit for the detection of antibodies specific for HCV in a biological sample where said kit contains at least one purified and isolated protein derived from the hypervariable sequences.

The invention also relates to methods for detecting the presence of hepatitis C virus in a mammal, said methods comprising analyzing the RNA of a mammal for the presence of hepatitis C virus. These methods can be used to identify the specific isolates of hepatitis C virus present in a mammal, which is useful in determining the proper course of treatment for an HCV-infected patient.

The invention also provides a diagnostic kit for the detection of hepatitis C virus in a biological sample. The kit comprises purified and isolated nucleic acid sequences useful as primers for reverse-transcription polymerase chain reaction (RT-PCR) analysis of RNA for the presence of hepatitis C virus genomic RNA.

The invention also relates to antibodies to the HVR1 proteins of the present invention and the use of such antibodies in passive immunoprophylaxis.

DESCRIPTION OF FIGURES

FIGS. 1A-K show computer generated sequence alignments of the nucleotide sequences of the HVR1 region of the E2 gene of 49 HCV isolates. The single letter abbreviations used for the nucleotides shown in FIGS. 1A-K are those standardly used in the art. FIG. 1A shows the alignment of SEQ ID NOs:1-8 to produce a consensus sequence for subtype I/1a. FIGS. 1B-1 and 1B-2 show the alignment of SEQ ID NOs:9-25 to produce a consensus sequence for subtype II/1b. FIGS. 1C-1, 1C-2 and 1C-3 show the alignment of SEQ ID NOs:1-25 to produce a consensus sequence for genotype 1, where genotype 1 comprises subtypes 1a (SEQ ID NOs:1-8) and 1b (SEQ ID NOs:9-25). FIG. 1D shows the alignment of SEQ ID NOs:26-29 to produce a consensus sequence for subtype III/2a. FIG. 1E shows the alignment of SEQ ID NOs:30-32 to produce a consensus sequence for subtype IV/2b. FIG. 1F shows the alignment of SEQ ID NOs:26-33 to produce a consensus sequence for genotype 2, where genotype 2 comprises subtypes 2a (SEQ ID NOs:26-29), 2b (SEQ ID NOs:30-32) and 2c (SEQ ID NO:33). FIG. 1G shows the alignment of SEQ ID NOs:34-38 to produce a consensus sequence for subtype V/3a. FIG. 1H shows the computer alignment of SEQ ID NOs:41-42 to produce a consensus sequence for subtype 4c. FIG. 1I shows the alignment of SEQ ID NOs:39-43 to produce a consensus sequence for genotype 4, where genotype 4 comprises subtypes 4a (SEQ ID NO:39), 4b (SEQ ID NO:40), 4c (SEQ ID NOs:41-42) and 4d (SEQ ID NO:43). FIG. 1J shows the alignment of SEQ ID NOs:44-48 to produce a consensus sequence for subtype 5a. FIGS. 1K-1 through 1K-4 show the alignment of the HVR1 sequences of the 49 HCV isolates (SEQ ID NOs:1-49) to produce a consensus sequence for all genotypes. The nucleotides shown in capital letters in the consensus sequences of FIGS. 1A-1K are those conserved within a genotype (FIGS. 1A-J) or among all isolates (FIGS. 1K-1 through 1K-4), while nucleotides shown in lower case letters in the consensus sequences are those variable within a genotype (FIGS. 1A-J) or among all isolates (FIGS. 1K-1 through 1K-4). In addition, when a lower case letter is shown in a consensus sequence, the lower case letter represents the nucleotide found most frequently in the sequences aligned to produce the consensus sequence. Finally, a hyphen at a nucleotide position in the consensus sequences in FIGS. 1A-K indicates that two nucleotides were found in equal numbers at that position in the aligned sequences. In the aligned sequences, nucleotides are shown in lower case letters if they differed from the nucleotides of both adjacent isolates.

FIGS. 2A-K show computer alignments of the deduced amino acid sequences of the HVR1 region of the envelope 2 gene of 49 isolates of HCV. The single letter abbreviations used for the amino acids shown in FIGS. 2A-K follow the conventional amino acid shorthand for the twenty naturally occurring amino acids. FIG. 2A shows the alignment of SEQ ID NOs:50-57 to produce a consensus sequence for subtype I/1a. FIG. 2B shows the alignment of SEQ ID NOs:58-74 to produce a consensus sequence for subtype II/1b. FIG. 2C shows the alignment of SEQ ID NOs:50-74 to produce a consensus sequence for genotype 1, where genotype 1 comprises subtypes 1a (SEQ ID NOs:50-57) and 1b (SEQ ID NOs:58-74). FIG. 2D shows the alignment of SEQ ID NOs:75-78 to produce a consensus sequence for subtype III/2a. FIG. 2E shows the alignment of SEQ ID NOs:79-81 to produce a consensus sequence for subtype IV/2b. FIG. 2F shows the alignment of SEQ ID NOs:75-82 to produce a consensus sequence for genotype 2, where genotype 2 comprises subtypes 2a (SEQ ID NOs:75-78), 2b (SEQ ID NOs:79-81) and 2c (SEQ ID NO:82). FIG. 2G shows the alignment of SEQ ID NOs:83-87 to produce a consensus sequence for subtype V/3a. FIG. 2H shows the computer alignment of SEQ ID NOs:90-91 to produce a consensus sequence for subtype 4c. FIG. 2I shows the alignment of SEQ ID NOs:88-92 to produce a consensus sequence for genotype 4, where genotype 4 comprises subtypes 4a (SEQ ID NO:88), 4b (SEQ ID NO:89), 4c (SEQ ID NOs:90-91) and 4d (SEQ ID NO:92). FIG. 2J shows the alignment of SEQ ID NOs:93-97 to produce a consensus sequence for subtype 5a. FIGS. 2K-1 and 2K-2 show the alignment of the HVR1 amino acid sequences of the 49 HCV isolates (SEQ ID NOs:50-98) to produce a consensus sequence for all genotypes. The amino acids shown in capital letters in the consensus sequences of FIGS. 2A-K are those conserved within a genotype (FIGS. 2A-J) or among all isolates (FIGS. 2K-1 and 2K-2), while amino acids shown in lower case letters in the consensus sequences are those variable within a genotype (FIGS. 2A-J) or among all isolates (FIGS. 2K-1 and 2K-2). In addition, when a lower case letter is shown in a consensus sequence, the letter represents the amino acid found most frequently in the sequences aligned to produce the consensus sequence. Finally, a hyphen at an amino acid position in the consensus sequences of FIGS. 2A-K indicates that two amino acids were found in equal numbers at that position in the aligned sequences. In the aligned sequences, amino acids are shown in lower case letters if they differed from the amino acids of both adjacent isolates.

DETAILED DESCRIPTION OF INVENTION

The present invention relates to nucleotide and deduced amino acid sequences of hypervariable region 1 (HVR1) of the E2 gene of 49 isolates of human hepatitis C virus (HCV), where HVR1 is defined as starting at amino acid 384 of the HCV polyprotein (Bukh, J. et al. (1995) Seminars in Liver Disease, 15:41-63; Hijikata, M. et al. (1991) Biochem. Biophys. Res. Comm., 175:220-228; and Hijikata, M. et al. (1991) Proc. Natl. Acad. Sci. U.S.A., 88:5547-5551). The nucleic acid sequences of the present invention were obtained as follows. Viral RNA was extracted from serum collected from humans infected with hepatitis C virus, and the viral RNA was then reverse transcribed and amplified by polymerase chain reaction using primers deduced from the sequence of the HCV strain H-77 (Bukh et al. (1993) Proc. Natl. Acad. Sci. U.S.A., 90:8234-8238). The amplified cDNA was then isolated by gel electrophoresis and sequenced.

The HVR1 nucleotide sequences of the 49 HCV isolates are shown in the sequence listing as SEQ ID NO:1 through SEQ ID NO:49.

The abbreviations used for the nucleotides are those standardly used in the art.

The deduced amino acid sequences of SEQ ID NO:1 through SEQ ID NO:49 are presented in the sequence listing as SEQ ID NO:50 through SEQ ID NO:98, where the amino acid sequence shown in SEQ ID NO:50 is deduced from the nucleotide sequence shown in SEQ ID NO:1, the amino acid sequence shown in SEQ ID NO:51 is deduced from the nucleotide sequence shown in SEQ ID NO:2, and so on. The deduced amino acid sequence of each of SEQ ID NOs:50-98 starts at nucleotide 1 of the corresponding nucleic acid sequence shown in SEQ ID NOs:1-49.

The three letter abbreviations used in SEQ ID NOs:50-98 follow the conventional amino acid shorthand for the twenty naturally occurring amino acids.
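The reading-frame convention described above (translation begins at nucleotide 1 of each HVR1 nucleotide sequence, and the result is reported in three letter abbreviations) can be illustrated with a minimal Python sketch. The standard genetic code is assumed; the function names `translate` and `three_letter` and the 12-nucleotide input string are illustrative only and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(nucleotides):
    """Translate from nucleotide 1; a trailing partial codon is ignored.

    Accepts DNA or RNA letters. Ambiguous bases raise KeyError.
    """
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def three_letter(peptide):
    """Convert a one-letter peptide string to three letter abbreviations."""
    return " ".join(THREE_LETTER[residue] for residue in peptide)


# Hypothetical input, chosen only to show the frame convention.
print(three_letter(translate("GAAACCCACGTC")))
```

Applied to each of the 49 nucleotide sequences in turn, a routine of this kind would pair SEQ ID NO:n with SEQ ID NO:n+49, following the numbering described above.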

Preferably, the HVR1 proteins of the present invention are substantially homologous to, and most preferably biologically equivalent to, native HCV HVR1 proteins. For purposes of the present invention, protein as used herein refers to a molecule containing a complete amino acid sequence shown in SEQ ID NOs:50-98 or a fragment of these sequences of at least about 6 to about 8 amino acids in length. By "biologically equivalent" as used throughout the specification and claims, it is meant that the compositions are immunogenically equivalent to the native HVR1 proteins. The HVR1 proteins of the present invention may also stimulate the production of protective antibodies upon injection into a mammal that would serve to protect the mammal upon challenge with HCV. By "substantially homologous" as used throughout the ensuing specification and claims to describe HVR1 proteins, it is meant a degree of homology in the amino acid sequence of the HVR1 proteins to the native HVR1 amino acid sequences disclosed herein. Preferably the degree of homology is in excess of 80%, preferably in excess of 90%, with a particularly preferred group of proteins being in excess of 95% homologous with the native HVR1 amino acid sequences.
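The homology thresholds above can be illustrated with a short Python sketch. The specification does not state how the degree of homology is calculated, so the sketch assumes the simplest reading: position-by-position identity between two pre-aligned sequences of equal length. The function names and the example peptides are hypothetical.

```python
def percent_identity(native, candidate):
    """Percentage of aligned positions carrying the same residue.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(native) != len(candidate) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(native.upper(), candidate.upper()))
    return 100.0 * matches / len(native)


def homology_tier(pct):
    """Map a percentage onto the preference tiers named in the text."""
    if pct > 95:
        return "in excess of 95%"
    if pct > 90:
        return "in excess of 90%"
    if pct > 80:
        return "in excess of 80%"
    return "80% or below"


# Hypothetical 10-residue peptides differing at one position.
pct = percent_identity("ETHVTGGSAG", "ETHVTGGNAG")
print(pct, homology_tier(pct))
```

On this assumption, a 32 amino acid HVR1 variant with a single substitution would be 96.875% identical to the native sequence and would fall in the particularly preferred group.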

Variations are contemplated in the nucleic acid sequences shown in SEQ ID NO:1 through SEQ ID NO:49 which will result in a nucleic acid sequence that is capable of directing production of a protein having at least six contiguous amino acids shown in SEQ ID NO:50 through SEQ ID NO:98 or an analog thereof. Due to the degeneracy of the genetic code, it is to be understood that numerous choices of nucleotides may be made that will lead to a DNA sequence capable of directing production of the instant protein or its analogs. As such, DNA sequences which are functionally equivalent to the sequences set forth above or which are functionally equivalent to sequences that would direct production of HVR1 amino acid sequences set forth in SEQ ID NOs:50-98 or analogs thereof are intended to be encompassed within the present invention.
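To give a sense of how numerous those nucleotide choices are, the following Python sketch counts the distinct DNA sequences that encode a given peptide under the standard genetic code. The function name and the example peptides are hypothetical and are not drawn from the sequence listing.

```python
from collections import Counter
from math import prod

# One-letter products of the 64 codons of the standard genetic code,
# listed in T, C, A, G order; "*" marks the three stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of synonymous codons for each residue, e.g. 6 for Leu, 1 for Trp.
CODONS_PER_RESIDUE = Counter(AMINO)


def count_coding_sequences(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    residues = peptide.upper()
    for residue in residues:
        if residue == "*" or residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"not a standard amino acid: {residue!r}")
    return prod(CODONS_PER_RESIDUE[r] for r in residues)


# Hypothetical six-residue fragment, the minimum length named in the text.
print(count_coding_sequences("GSAGRT"))
```

Because the count is a product over residues, even a six amino acid fragment can correspond to thousands of functionally equivalent DNA sequences.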

The term analog, as used throughout the specification or claims to describe the HVR1 proteins of the present invention, includes any protein having an amino acid residue sequence substantially identical to a sequence specifically shown herein in which one or more residues have been conservatively substituted with a biologically equivalent residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one polar (hydrophilic) residue for another, such as between arginine and lysine, between glutamine and asparagine, or between glycine and serine; the substitution of one basic residue such as lysine, arginine or histidine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid, for another.
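The example substitution classes listed above can be written down as a small lookup, sketched here in Python. Only the residue groupings named in the text are encoded; the list is given in the text as examples, so a real analysis might admit further classes. The function names and the example peptides are hypothetical.

```python
# Residue groups taken from the examples in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    frozenset("RK"),    # polar (hydrophilic): Arg and Lys
    frozenset("QN"),    # polar (hydrophilic): Gln and Asn
    frozenset("GS"),    # polar (hydrophilic): Gly and Ser
    frozenset("KRH"),   # basic: Lys, Arg, His
    frozenset("DE"),    # acidic: Asp, Glu
]


def is_conservative(a, b):
    """True if replacing residue a with b stays within one listed group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def substitutions(native, variant):
    """List (position, native, variant, conservative?) for each difference.

    Positions are 1-based; sequences must be aligned and of equal length.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must be of equal length")
    return [
        (i + 1, x, y, is_conservative(x, y))
        for i, (x, y) in enumerate(zip(native.upper(), variant.upper()))
        if x != y
    ]


# Hypothetical peptides: two listed substitutions and one unlisted one.
for row in substitutions("ETHVTGG", "DTHITGA"):
    print(row)
```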

The phrase "conservative substitution" also includes the use of a chemically derivatized residue in place of a non-derivatized residue provided that the resulting protein is biologically equivalent to the native HVR1 protein.

"Chemical derivative" refers to an HVR1 protein having one or more residues chemically derivatized by reaction of a functional side group. Examples of such derivatized molecules include, but are not limited to, those molecules in which free amino groups have been derivatized to form amine hydrochlorides, p-toluene sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl groups, chloroacetyl groups or formyl groups. Free carboxyl groups may be derivatized to form salts, methyl and ethyl esters or other types of esters or hydrazides. Free hydroxyl groups may be derivatized to form O-acyl or O-alkyl derivatives. The imidazole nitrogen of histidine may be derivatized to form N-im-benzylhistidine. Also included as chemical derivatives are those proteins which contain one or more naturally-occurring amino acid derivatives of the twenty standard amino acids. For example: 4-hydroxyproline may be substituted for proline; 5-hydroxylysine may be substituted for lysine; 3-methylhistidine may be substituted for histidine; homoserine may be substituted for serine; and ornithine may be substituted for lysine. The HVR1 proteins of the present invention also include any protein having one or more additions and/or deletions of residues relative to the sequence of a peptide whose sequence is shown herein, so long as the protein is biologically equivalent to the native HVR1 protein.

The present invention also relates to multiple computer-generated alignments of the nucleotide and deduced amino acid sequences shown in SEQ ID NOs:1-98.

The grouping of SEQ ID NOs:1-49 into HCV genotypes is shown below.

    ______________________________________
    SEQ ID NOs:     Subtypes     Genotypes
    ______________________________________
    1-8             I/1a         1
    9-25            II/1b        1
    26-29           III/2a       2
    30-32           IV/2b        2
    33              2c           2
    34-38           V/3a         3
    39              4a           4
    40              4b           4
    41-42           4c           4
    43              4d           4
    44-48           5a           5
    49              6a           6
    ______________________________________

For those subtypes or genotypes containing more than one HVR1 nucleotide sequence, computer alignment of the constituent nucleotide sequences of the subtype or genotype was conducted using the program GENALIGN (Intelligenetics Inc., Mountain View, Calif.) in order to produce a consensus sequence. These alignments and their resultant consensus sequences are shown in FIGS. 1A-1J. Further alignment of all 49 HVR1 sequences to produce a consensus sequence for all genotypes is shown in FIGS. 1K-1 through 1K-4. The consensus sequences shown in FIGS. 1A-K serve to highlight regions of homology and non-homology between sequences found within the same subtype or genotype or in different genotypes and hence, these alignments can be used by one skilled in the art to select HVR1 sequences useful as reagents in diagnostic assays or vaccines.
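The notation used for the consensus sequences in the figures (a capital letter for a conserved position, a lower case letter for the most frequent residue at a variable position, and a hyphen where two residues occur in equal numbers) can be sketched in Python as follows. This is not a reimplementation of GENALIGN: the sketch assumes the sequences are already aligned to equal length and only derives the consensus line. Treating any tie for the highest count as a hyphen is an assumption, and the input sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Derive a consensus line from equal-length, pre-aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    out = []
    for column in zip(*(s.upper() for s in aligned)):
        ranked = Counter(column).most_common()
        if len(ranked) == 1:
            out.append(ranked[0][0])          # conserved: capital letter
        elif ranked[0][1] == ranked[1][1]:
            out.append("-")                   # tie for most frequent: hyphen
        else:
            out.append(ranked[0][0].lower())  # variable: most frequent, lower case
    return "".join(out)


# Hypothetical four-sequence alignment.
print(consensus(["ACGTAC", "ACGTTC", "ACCTTC", "ACGTAG"]))
```

The same routine applies unchanged to aligned amino acid sequences, since it only counts characters per column.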

The grouping of SEQ ID NOs:50-98 into HCV genotypes is shown below:

    ______________________________________
    SEQ ID NOs:     Subtypes     Genotypes
    ______________________________________
    50-57           I/1a         1
    58-74           II/1b        1
    75-78           III/2a       2
    79-81           IV/2b        2
    82              2c           2
    83-87           V/3a         3
    88              4a           4
    89              4b           4
    90-91           4c           4
    92              4d           4
    93-97           5a           5
    98              6a           6
    ______________________________________

For those subtypes or genotypes containing more than one HVR1 amino acid sequence, computer alignment of the constituent sequences of each subtype or genotype was conducted using the computer program GENALIGN in order to produce a consensus sequence. These alignments and their resultant consensus sequences are shown in FIGS. 2A-J. Alignment of all 49 HVR1 sequences to produce a consensus amino acid sequence for all genotypes is shown in FIGS. 2K-1 and 2K-2. The consensus sequences shown in FIGS. 2A-2K serve to highlight regions of homology and non-homology between HVR1 amino acid sequences of the same subtype or genotype and of different genotypes and hence, these alignments can readily be used by those skilled in the art to design HVR1 proteins useful in assays and vaccines for the diagnosis and prevention of HCV infection.

In order to identify hydrophilic domains within HVR1 that might represent antigenic determinants, a Kyte and Doolittle analysis (Kyte, J. and Doolittle, R. F. (1982) J. Mol. Biol., 157:105-132) of each of the amino acid sequences shown in SEQ ID NOs:50-98 was conducted. The observed hydrophilic domains for the amino acid sequences of each of these isolates are shown below, where amino acid position 1 is the amino-terminal amino acid of the HVR1 amino acid sequences shown in SEQ ID NOs:50-98. (Note that all the amino acid sequences shown in SEQ ID NOs:50-98 are 32 amino acids in length except for SEQ ID NOs:58 and 59 (isolates D1 and D3 respectively), which are 36 amino acids in length due to the presence of an additional four amino acids at their amino termini, and SEQ ID NO:98, which is lacking a single amino terminal amino acid relative to SEQ ID NOs:50-57 and 60-97 and five amino terminal amino acids relative to SEQ ID NOs:58 and 59. Thus in the table below, the first four amino acids of SEQ ID NOs:58 and 59 are represented by the numbers -4, -3, -2 and -1 while the first amino acid in SEQ ID NO:98 (isolate HK2) is assigned the number 2.)

    ______________________________________
    Type   Isolate   amino acid position of HVR 5→3
    ______________________________________
    6a     HK2       2-6       9-13      23-28
    5a     SA6       1-5       9-14      22-28
    5a     SA13      1-5       9-13      22-28
    5a     SA1       1-4       11-15     22-28
    5a     SA7       1-2       11-14     23-28
    5a     SA4       1-5       9-13      23-28
    4c     Z6        1-4       9-15      22-28
    4b     Z1        1-4       9-14      23-28
    4a     Z4        1-4       7-13      22-28
    3a     S2        1-5       9-14      23-28
    3a     S52       1-5       12-15     23-28
    2c     S83       1-5       9-15      22-28
    2b     T8        1-6       9-13      22-28
    1b     T3        1-4       11-14     23-28
    1b     HK4       1-4       9-16      23-28
    1b     HK3       1-4       10-16     23-28
    1b     S9        1-2       8-14      23-28
    1b     IND8      1-2       7-16      23-28
    1b     T10       1-5       9-14      23-28
    1b     DK1       1-3       8-14      23-28
    1b     P10       1-6       12-16     23-28
    1a     S18       1-5       8-16      23-28
    1a     SW1       1-5       9-13      23-28
    1a     S14       1-3       8-13      23-28
    1a     US11      1-4       8-10      23-28
    3a     S54       1-6       9-16      23-28
    1b     IND5           1-14           22-28
    1a     DR1            1-12           22-28
    1b     D3        -4→1      9-13      23-28
    1b     HK8       1-4       9-15      23-28
    1a     DK9       1-5       9-14      23-28
    1b     SA10           1-13           23-28
    1b     S45            1-13           23-27
    1b     D1             -4-14          23-28
    1b     SW2            1-15           23-28
    2a     T2             1-14           23-28
    2a     T9             1-13           23-28
    2b     DK8            1-14           23-28
    1a     DK7       1-5       8-9       23-28
    1a     DR4       1-5       9-12      22-28
    1b     US6       1-4       8-16      22-28
    1b     HK5       1-2       9-16      23-28
    2a     T4        1-2       12-15     23-28
    2a     US10      1-6       9-10      23-28
    3a     HK10                9-13      23-28
    4d     DK13                7-13      22-28
    4c     Z7                  12-13     23-28
    3a     DK12           1-14           23-28
    2b     DK11      1-4       12-13     22-28
    ______________________________________

The data presented above illustrate that there are typically 3 hydrophilic domains present in the HVR1 amino acid sequences shown in SEQ ID NOs:50-98. These hydrophilic domains are located at the amino and carboxy termini of HVR1 and in roughly the middle of HVR1. Although all three of these hydrophilic domains may represent important antigenic determinants, the carboxy terminal hydrophilic domain of about 6 amino acids in length is of particular interest in that it is universally conserved in the amino acid sequences shown in SEQ ID NOs:50-98. This conservation of the C-terminal hydrophilic domain suggests that this domain may not only be an immunodominant epitope for HCV but may also play an important role in the viral life cycle. Thus, amino acid sequences containing the C-terminal hydrophilic domains of SEQ ID NOs:50-98 are preferred immunogens in the vaccines of the present invention.
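A Kyte and Doolittle analysis of the kind used to locate these domains can be sketched in Python as a sliding-window average over the published hydropathy scale. The specification does not state the window width or the cut-off used, so the values below (a window of 5 residues and a mean hydropathy below 0 counted as hydrophilic) are assumptions, as are the function names and the example peptide; the sketch would therefore not necessarily reproduce the positions tabulated above.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=5):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[a] for a in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_runs(seq, window=5, threshold=0.0):
    """1-based (start, end) runs of window centres with mean below threshold."""
    half = window // 2
    runs, start, end = [], None, None
    for i, value in enumerate(hydropathy_profile(seq, window)):
        pos = i + half + 1  # 1-based position of the window centre
        if value < threshold:
            if start is None:
                start = pos
            end = pos
        elif start is not None:
            runs.append((start, end))
            start = None
    if start is not None:
        runs.append((start, end))
    return runs


# Hypothetical peptide: a lysine-rich stretch followed by isoleucines.
print(hydrophilic_runs("KKKKKKIIIIII"))
```

Positions within half a window of either terminus have no full window centred on them in this sketch, so terminal domains would need a narrower window or an edge rule to be reported from position 1.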

Accordingly, the present invention includes a recombinant DNA method for the manufacture of HVR1 proteins in which natural or synthetic nucleic acid sequences may be used to direct the production of HVR1 proteins having at least six contiguous amino acids contained in the amino acid sequences shown in SEQ ID NOs:50-98.

In one embodiment of the invention, the method comprises:

(a) preparation of a nucleic acid sequence capable of directing a host organism to produce HVR1 protein;

(b) cloning the nucleic acid sequence into a vector capable of being transferred into and replicated in a host organism, such vector containing operational elements for the nucleic acid sequence;

(c) transferring the vector containing the nucleic acid and operational elements into a host organism capable of expressing the protein;

(d) culturing the host organism under conditions appropriate for amplification of the vector and expression of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the method for the recombinant DNA synthesis of an HCV HVR1 protein encoded by any one of the nucleic acid sequences shown in SEQ ID NOs:1-49 comprises:

(a) culturing a transformed or transfected host organism containing a nucleic acid sequence capable of directing the host organism to produce HVR1 protein, under conditions such that the protein is produced, said protein exhibiting substantial homology to a native HVR1 protein having an amino acid sequence according to any one of the amino acid sequences shown in SEQ ID NOs:50-98.

In one embodiment, the RNA sequence of an HCV isolate was isolated and converted to cDNA as follows. Viral RNA was extracted from a biological sample collected from human subjects infected with hepatitis C, and the viral RNA was then reverse transcribed and amplified by polymerase chain reaction using primers deduced from the sequence of HCV strain H-77 as described in Bukh et al. ((1993) Proc. Natl. Acad. Sci. USA, 90:8234-8238). Once amplified, the PCR fragments were isolated by gel electrophoresis and sequenced. This approach was used to obtain the nucleic acid sequences shown in SEQ ID NOs:1-49. In an alternative embodiment, a nucleic acid sequence capable of directing host organism synthesis of the given HVR1 protein may be synthesized chemically and inserted into an expression vector.

The vectors contemplated for use in the present invention include any vectors into which a nucleic acid sequence as described above can be inserted, along with any preferred or required operational elements, and which vector can then be subsequently transferred into a host organism and replicated in such organisms. Preferred vectors are those whose restriction sites have been well documented and which contain the operational elements preferred or required for transcription of the nucleic acid sequence.

The "operational elements" as discussed herein include at least one promoter, at least one operator, at least one leader sequence, at least one terminator codon, and any other DNA sequences necessary or preferred for appropriate transcription and subsequent translation of the vector nucleic acid. In particular, it is contemplated that such vectors will contain at least one origin of replication recognized by the host organism along with at least one selectable marker and at least one promoter sequence capable of initiating transcription of the nucleic acid sequence.

In construction of the recombinant expression vectors of the present invention, it should additionally be noted that multiple copies of the nucleic acid sequence of interest and its attendant operational elements may be inserted into each vector. In such an embodiment, the host organism would produce greater amounts per vector of the desired HVR1 protein. The number of copies of the nucleic acid sequence which may be inserted into the vector is limited only by the ability of the resultant vector, given its size, to be transferred into and replicated and transcribed in an appropriate host microorganism.

Of course, those of ordinary skill in the art would readily understand that multiple copies of different HVR1 nucleic acid sequences may be inserted into a single vector such that a host organism transformed or transfected with said vector would produce multiple HVR1 proteins. For example, a polycistronic vector in which multiple different HVR1 proteins may be expressed from a single vector is created by placing expression of each protein under control of an internal ribosomal entry site (IRES) (Molla, A. et al. Nature, 356:255-257 (1992); Gong, S. K. et al. J. of Virol., 263:1651-1660 (1989)).

In another embodiment, restriction digest fragments containing a sequence coding for HVR1 proteins can be inserted into a suitable expression vector that functions in prokaryotic or eukaryotic cells. By suitable is meant that the vector is capable of carrying and expressing a complete nucleic acid sequence coding for an HVR1 protein. Preferred expression vectors are those that function in a eukaryotic cell. Examples of such vectors include, but are not limited to, plasmid, vaccinia virus, adenovirus, retrovirus or herpes virus vectors.

In yet another embodiment, the selected recombinant expression vector may then be transfected into a suitable eukaryotic cell system for purposes of expressing the recombinant protein. Such eukaryotic cell systems include but are not limited to cell lines such as HeLa, MRC-5 or CV-1 or other monkey kidney cell substrates.

The expressed recombinant protein may be detected by methods known in the art including, but not limited to, Coomassie blue staining and Western blotting.

The present invention also relates to substantially purified and isolated recombinant HVR1 proteins. In one embodiment, the expressed recombinant protein can be obtained as a crude lysate or it can be purified by standard protein purification procedures known in the art which may include differential precipitation, molecular sieve chromatography, ion-exchange chromatography, isoelectric focusing, gel electrophoresis and affinity and immunoaffinity chromatography. The recombinant protein may be purified by passage through a column containing a resin which has bound thereto antibodies specific for HVR1 protein.

Alternatively, those of ordinary skill in the art would be aware that the proteins of the present invention or analogs thereof can be synthesized by automated instruments sold by a variety of manufacturers or can be commercially custom-ordered and prepared. The term analog has been described earlier in the specification and for purposes of describing the proteins of the present invention, analogs can further include branched, cyclic or other non-linear arrangements of the amino acid sequences of the present invention.

The present invention therefore relates to the use of recombinant or synthetic HVR1 proteins as diagnostic agents and vaccines. In one embodiment, the proteins of this invention can be used in immunoassays for diagnosing or prognosing hepatitis C in a mammal. For the purposes of the present invention, "mammal" as used throughout the specification and claims, includes, but is not limited to humans, chimpanzees, other primates and the like. In a preferred embodiment, the immunoassay is useful in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be those commonly used by those skilled in the art including, but not limited to, radioimmunoassay, Western blot assay, immunofluorescent assay, enzyme immunoassay, chemiluminescent assay, immunohistochemical assay, immunoprecipitation and the like. Standard techniques known in the art for ELISA are described in Methods in Immunodiagnosis, 2nd Edition, Rose and Bigazzi, eds., John Wiley and Sons, 1980 and Campbell et al., Methods of Immunology, W. A. Benjamin, Inc., 1964, both of which are incorporated herein by reference. Such assays may be a direct, indirect, competitive, or noncompetitive immunoassay as described in the art (Oellerich, M. 1984. J. Clin. Chem. Clin. BioChem 22:895-904). Biological samples appropriate for such detection assays include, but are not limited to, serum, liver, saliva, lymphocytes or other mononuclear cells.

In a preferred embodiment, test serum is reacted with a solid phase reagent having surface-bound recombinant HVR1 protein(s) as antigen(s). The solid surface reagent can be prepared by known techniques for attaching protein to solid support material. These attachment methods include non-specific adsorption of the protein to the support or covalent attachment of the protein to a reactive group on the support. After reaction of the antigen with anti-HCV antibody, unbound serum components are removed by washing and the antigen-antibody complex is reacted with a secondary antibody such as labelled anti-human antibody. The label may be an enzyme which is detected by incubating the solid support in the presence of a suitable fluorimetric or colorimetric reagent. Other detectable labels may also be used, such as radiolabels or colloidal gold, and the like.

The HCV HVR1 proteins and analogs thereof may be prepared in the form of a kit, alone, or in combinations with other reagents such as secondary antibodies, for use in immunoassays. It is understood by those of ordinary skill in the art that, due to the variability of HVR1 amino acid sequences between genotypes, the use of a single HVR1 protein as an antigen in the above-described immunoassays may be useful in detecting a single genotype of HCV. Alternatively, HVR1 proteins of multiple genotypes used together as antigens in the above-described immunoassays can serve as universal probes capable of detecting all genotypes of HCV.

In yet another embodiment, the HVR1 proteins or analogs thereof can be used as a vaccine to protect mammals against challenge with hepatitis C. The vaccine, which acts as an immunogen, may be a cell, cell lysate from cells transfected with a recombinant expression vector or a culture supernatant containing the expressed protein. Alternatively, the immunogen is a partially or substantially purified recombinant protein or a chemically synthesized protein. In a preferred embodiment, HVR1 proteins having amino acid sequences found in multiple HCV isolates from different genotypes are administered together to provide protection against challenge with multiple isolates of HCV or a synthetic protein.

While it is possible for the immunogen to be administered in a pure or substantially pure form, it is preferable to present it as a pharmaceutical composition, formulation or preparation.

The formulations of the present invention, both for veterinary and for human use, comprise an immunogen as described above, together with one or more pharmaceutically acceptable carriers and optionally other therapeutic ingredients. The carrier(s) must be "acceptable" in the sense of being compatible with the other ingredients of the formulation and not deleterious to the recipient thereof. The formulations may conveniently be presented in unit dosage form and may be prepared by any method well-known in the pharmaceutical art.

All methods include the step of bringing into association the active ingredient with the carrier which constitutes one or more accessory ingredients. In general, the formulations are prepared by uniformly and intimately bringing into association the active ingredient with liquid carriers or finely divided solid carriers or both, and then, if necessary, shaping the product into the desired formulation.

Formulations suitable for intravenous, intramuscular, subcutaneous, or intraperitoneal administration conveniently comprise sterile aqueous solutions of the active ingredient with solutions which are preferably isotonic with the blood of the recipient. Such formulations may be conveniently prepared by dissolving the solid active ingredient in water containing physiologically compatible substances such as sodium chloride (e.g. 0.1-2.0 M), glycine, and the like, and having a buffered pH compatible with physiological conditions to produce an aqueous solution, and rendering said solution sterile. These may be present in unit or multi-dose containers, for example, sealed ampules or vials.

The formulations of the present invention may incorporate a stabilizer. Illustrative stabilizers are preferably incorporated in an amount of 0.10-10,000 parts by weight per part by weight of immunogens. If two or more stabilizers are to be used, their total amount is preferably within the range specified above. These stabilizers are used in aqueous solutions at the appropriate concentration and pH. The specific osmotic pressure of such aqueous solutions is generally in the range of 0.1-3.0 osmoles, preferably in the range of 0.8-1.2. The pH of the aqueous solution is adjusted to be within the range of 5.0-9.0, preferably within the range of 6-8. In formulating the immunogen of the present invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations may be achieved through the use of polymers to complex or adsorb the proteins or their derivatives. The controlled delivery may be exercised by selecting appropriate macromolecules (for example polyester, polyamino acids, polyvinyl pyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate) and the concentration of macromolecules as well as the methods of incorporation in order to control release. Another possible method to control the duration of action by controlled-release preparations is to incorporate the proteins, protein analogs or their functional derivatives, into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization, for example, hydroxymethylcellulose or gelatin-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions.

When oral preparations are desired, the compositions may be combined with typical carriers, such as lactose, sucrose, starch, talc, magnesium stearate, crystalline cellulose, methyl cellulose, carboxymethyl cellulose, glycerin, sodium alginate or gum arabic among others.

Vaccination can be conducted by conventional methods. For example, the immunogen or immunogens can be used in a suitable diluent such as saline or water, or complete or incomplete adjuvants. Further, the immunogen(s) may or may not be bound to a carrier to make the protein(s) immunogenic. Examples of such carrier molecules include but are not limited to bovine serum albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus toxoid, and the like. The immunogen(s) can be administered by any route appropriate for antibody production such as intravenous, intraperitoneal, intramuscular, subcutaneous, and the like. The immunogen(s) may be administered once or at periodic intervals until a significant titer of anti-HCV antibody is produced. The antibody may be detected in the serum using an immunoassay. Doses of HVR1 protein(s) effective to elicit a protective antibody response against HCV infection range from about 0.1 to about 100 μg with a more preferred range being about 2 to about 20 μg.

In yet another embodiment, the immunogen may be a nucleic acid sequence or sequences capable of directing host organism synthesis of HVR1 protein(s). Such nucleic acid sequence(s) may be inserted into a suitable expression vector by methods known to those skilled in the art. Expression vectors suitable for producing high efficiency gene transfer in vivo include retroviral, adenoviral and vaccinia viral vectors. Operational elements of such expression vectors are disclosed previously in the present specification and are known to one skilled in the art. Such expression vectors can be administered intravenously, intramuscularly, intradermally, subcutaneously, intraperitoneally or orally.

In an alternative embodiment, direct gene transfer may be accomplished via intramuscular injection of, for example, plasmid-based eukaryotic expression vectors containing a nucleic acid sequence capable of directing host organism synthesis of HVR1 protein(s). Such an approach has previously been utilized to produce the hepatitis B surface antigen in vivo and resulted in an antibody response to the surface antigen (Davis, H. L. et al. (1993) Human Molecular Genetics, 2:1847-1851; see also Davis et al. (1993) Human Gene Therapy, 4:151-159 and 733-740). In a preferred embodiment, HVR1 nucleic acid sequences of isolates from multiple genotypes of HCV are administered together to provide protection against challenge with multiple genotypes of HCV.

Doses of HVR1 protein(s)-encoding nucleic acid sequence effective to elicit a protective antibody response against HCV infection range from about 0.5 to about 5000 μg, with a more preferred range being about 10 to about 1000 μg.

The HVR1 proteins and expression vectors containing a nucleic acid sequence capable of directing host organism synthesis of HVR1 protein(s) may be supplied in the form of a kit, alone, or in the form of a pharmaceutical composition as described above.

The nucleic acid sequences of the present invention or primers/probes derived therefrom can also be used to analyze the RNA of a mammal for the presence of specific hepatitis C virus isolates.

The RNA to be analyzed can be isolated from serum, liver, saliva, lymphocytes or other mononuclear cells as viral RNA, whole cell RNA or as poly(A)⁺ RNA. Whole cell RNA can be isolated by methods known to those skilled in the art. Such methods include extraction of RNA by differential precipitation (Birnboim, H. C. (1988) Nucleic Acids Res., 16:1487-1497), extraction of RNA by organic solvents (Chomczynski, P. et al. (1987) Anal. Biochem., 162:156-159) and extraction of RNA with strong denaturants (Chirgwin, J. M. et al. (1979) Biochemistry, 18:5294-5299). Poly(A)⁺ RNA can be selected from whole cell RNA by affinity chromatography on oligo-d(T) columns (Aviv, H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-1412) or poly(U) RNA can be selected by affinity chromatography on oligo-d(A) columns. A preferred method of isolating RNA is extraction of viral RNA by the guanidinium-phenol-chloroform method of Bukh et al. (1992a).

The methods for analyzing the RNA for the presence of HCV include, but are not limited to, Northern blotting (Alwine, J. C. et al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and slot blot hybridization (Kafatos, F. C. et al. (1979) Nucleic Acids Res., 7:1541-1552), filter hybridization (Hollander, M. C. et al. (1990) Biotechniques, 9:174-179), RNase protection (Sambrook, J. et al. (1989) in "Molecular Cloning, A Laboratory Manual", Cold Spring Harbor Press, Plainview, N.Y.) and reverse-transcription polymerase chain reaction (RT-PCR) (Watson, J. D. et al. (1992) in "Recombinant DNA" Second Edition, W. H. Freeman and Company, New York).

A preferred method for analyzing the RNA is RT-PCR. In this method, the RNA can be reverse transcribed to first strand cDNA using a primer or primers derived from the nucleotide sequences shown in SEQ ID NOs:1-49 or sequences complementary to those. Once the cDNAs are synthesized, PCR amplification is carried out using pairs of primers designed to hybridize with sequences in the hypervariable region which are an appropriate distance apart (at least about 50 nucleotides) to permit amplification of the cDNA and subsequent detection of the amplification product. Each primer of a pair is a single-stranded oligonucleotide of about 15 to about 40 bases in length with a more preferred range being about 20 to about 30 bases in length where one primer (the "upstream" primer) is complementary to the original RNA and the second primer (the "downstream" primer) is complementary to the first strand of cDNA generated by reverse transcription of the RNA. Optimization of the amplification reaction to obtain sufficiently specific hybridization to the nucleotide sequence of interest is well within the skill in the art and is preferably achieved by adjusting the annealing temperature.
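The length and spacing criteria stated above for a primer pair can be checked programmatically. The following Python sketch is offered as an illustration only and rests on several assumptions not stated in the specification: the template is given as the sense strand of the cDNA, the forward primer is identical to a stretch of that strand, the reverse primer is the reverse complement of a stretch further along it, and "distance apart" is measured between the start positions of the two binding sites. The names `reverse_complement` and `check_primer_pair` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_primer_pair(template: str, forward: str, reverse: str,
                      min_separation: int = 50) -> dict:
    """Check a primer pair against the length and spacing ranges in the text.

    Assumptions (illustrative): exact matches only, first match is used,
    and separation is the distance between the starts of the two sites.
    """
    template, forward, reverse = template.upper(), forward.upper(), reverse.upper()
    fwd_start = template.find(forward)
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start < 0 or rev_start < 0:
        raise ValueError("a primer does not match the template")
    lengths = (len(forward), len(reverse))
    separation = rev_start - fwd_start
    return {
        "length_ok": all(15 <= n <= 40 for n in lengths),         # about 15 to 40 bases
        "length_preferred": all(20 <= n <= 30 for n in lengths),  # about 20 to 30 bases
        "separation": separation,
        "separation_ok": separation >= min_separation,            # at least about 50 nt
    }
```

The sketch does not model annealing temperature or mismatch tolerance, which the text leaves to routine optimization.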

The amplification products of PCR can be detected either directly or indirectly. In one embodiment, direct detection of the amplification products is carried out via labelling of primer pairs. Labels suitable for labelling the primers of the present invention are known to one skilled in the art and include radioactive labels, biotin, avidin, enzymes and fluorescent molecules. The desired labels can be incorporated into the primers prior to performing the amplification reaction. A preferred labelling procedure utilizes radiolabeled ATP and T4 polynucleotide kinase (Sambrook, J. et al. (1989) in "Molecular Cloning, A Laboratory Manual", Cold Spring Harbor Press, Plainview, N.Y.). Alternatively, the desired label can be incorporated into the primer extension products during the amplification reaction in the form of one or more labelled dNTPs. In the present invention, the labelled amplified PCR products can be detected by agarose gel electrophoresis followed by ethidium bromide staining and visualization under ultraviolet light or via direct sequencing of the PCR products.

In yet another embodiment, unlabelled amplification products can be detected via hybridization with nucleic acid probes that are radioactively labelled or labelled with biotin, in methods known to one skilled in the art such as dot and slot blot hybridization (Kafatos, F. C. et al. (1979)) or filter hybridization (Hollander, M. C. et al. (1990)).

In one embodiment, the nucleic acid sequences used as probes are selected from, and substantially homologous to, SEQ ID NOs:1-49. In an alternative embodiment, the sequence alignments shown in FIGS. 1A-1K may be used to design hybridization probes.

The nucleic acid sequence used as a probe to detect PCR amplification products of the present invention can be labeled in single-stranded or double-stranded form. Labelling of the nucleic acid sequence can be carried out by techniques known to one skilled in the art. Such labelling techniques can include radiolabels and enzymes (Sambrook, J. et al. (1989) in "Molecular Cloning, A Laboratory Manual", Cold Spring Harbor Press, Plainview, N.Y.). In addition, there are known non-radioactive techniques for signal amplification including methods for attaching chemical moieties to pyrimidine and purine rings (Dale, R. N. K. et al. (1973) Proc. Natl. Acad. Sci., 70:2238-2242; Heck, R. F. (1968) J. Am. Chem. Soc., 90:5518-5523), methods which allow detection by chemiluminescence (Barton, S. K. et al. (1992) J. Am. Chem. Soc., 114:8736-8740), methods utilizing biotinylated nucleic acid probes (Johnson, T. K. et al. (1983) Anal. Biochem., 133:126-131; Erickson, P. F. et al. (1982) J. of Immunology Methods, 51:241-249; Matthaei, F. S. et al. (1986) Anal. Biochem., 157:123-128) and methods which allow detection by fluorescence using commercially available products.

The administration of the nucleic acid sequences or proteins of the present invention as immunogens may be for either a prophylactic or therapeutic purpose. When provided prophylactically, the immunogen(s) is provided in advance of any exposure to HCV or in advance of any symptom(s) due to HCV infection. The prophylactic administration of the immunogen serves to prevent or attenuate any subsequent infection of HCV in a mammal. When provided therapeutically, the immunogen(s) is provided at (or shortly after) the onset of the infection or at the onset of any symptom of infection or disease caused by HCV or at any time thereafter. The therapeutic administration of the immunogen(s) serves to attenuate or eradicate the infection or disease.

In addition to use as a vaccine, the compositions can be used to prepare antibodies to the HVR1 protein. The antibodies can be used directly as antiviral agents or they may be used in immunoassays disclosed herein to detect the presence of the hepatitis C virus in patient sera. To prepare antibodies, a host animal can be immunized using the HVR1 proteins of the present invention or expression vectors containing nucleic acid sequences encoding such proteins. The host serum or plasma is collected following an appropriate time interval to provide a composition comprising antibodies reactive with the HVR1 region protein of the virus particle. The gamma globulin fraction or the IgG antibodies can be obtained, for example, by use of saturated ammonium sulfate or DEAE Sephadex, or other techniques known to those skilled in the art. The antibodies are substantially free of many of the adverse side effects which may be associated with other anti-viral agents such as drugs.

The antibody compositions can be made even more compatible with the host system by minimizing potential adverse immune system responses. This is accomplished by removing all or a portion of the Fc portion of a foreign species antibody or using an antibody of the same species as the host animal, for example, the use of antibodies from human/human hybridomas. Humanized antibodies (i.e., nonimmunogenic in a human) may be produced, for example, by replacing an immunogenic portion of an antibody with a corresponding, but nonimmunogenic portion (i.e., chimeric antibodies). Such chimeric antibodies may contain the reactive or antigen-binding portion of an antibody from one species and the Fc portion of an antibody (nonimmunogenic) from a different species. Examples of chimeric antibodies include, but are not limited to, non-human mammal-human chimeras, rodent-human chimeras, murine-human and rat-human chimeras (Robinson et al., International Patent Application 184,187; Taniguchi M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., PCT Application WO 86/01533; Cabilly et al., 1987 Proc. Natl. Acad. Sci. USA 84:3439; Nishimura et al., 1987 Canc. Res. 47:999; Wood et al., 1985 Nature 314:446; Shaw et al., 1988 J. Natl. Cancer Inst. 80:1553, all incorporated herein by reference).

General reviews of "humanized" chimeric antibodies are provided by Morrison S., 1985 Science 229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be alternatively produced by CDR or CEA substitution (Jones et al., 1986 Nature 321:552; Verhoeyen et al., 1988 Science 239:1534; Beidler et al., 1988 J. Immunol. 141:4053, all incorporated herein by reference).

The antibodies or antigen binding fragments may also be produced by genetic engineering. The technology for expression of both heavy and light chain genes in E. coli is the subject of the PCT patent applications; publication number WO 901443, W0901443, and WO 9014424 and in Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of enhancing the immune response. The antibodies can be administered in amounts similar to those used for other therapeutic administrations of antibody. For example, normal immune globulin is administered at 0.02-0.1 ml/lb body weight during the early incubation period of other viral diseases such as rabies, measles, and hepatitis B to interfere with viral entry into cells. Thus, antibodies reactive with the HVR1 proteins can be passively administered alone or in conjunction with another antiviral agent to a host infected with an HCV to enhance the immune response and/or the effectiveness of an antiviral drug.

Alternatively, antibodies to the HVR1 region can be induced by administering anti-idiotype antibodies as immunogens. Conveniently, a purified antibody preparation prepared as described above is used to induce anti-idiotype antibody in a host animal. The composition is administered to the host animal in a suitable diluent. Following administration, usually repeated administration, the host produces anti-idiotype antibody. To eliminate an immunogenic response to the Fc region, antibodies produced by the same species as the host animal can be used or the Fc region of the administered antibodies can be removed. Following induction of anti-idiotype antibody in the host animal, serum or plasma is removed to provide an antibody composition. The composition can be purified as described above for anti-HVR1 antibodies, or by affinity chromatography using anti-HVR1 antibodies bound to the affinity matrix. The anti-idiotype antibodies produced are similar in conformation to the authentic HVR1 amino acid sequence and may be used to prepare an HCV vaccine rather than using an HVR1 protein.

When used as a means of inducing anti-HCV virus antibodies in an animal, the manner of injecting the antibody is the same as for vaccination purposes, namely intramuscularly, intraperitoneally, subcutaneously or the like in an effective concentration in a physiologically suitable diluent with or without adjuvant. One or more booster injections may be desirable.

The HVR1 proteins of the invention are also intended for use in producing antiserum designed for pre- or post-exposure prophylaxis. Here an HVR1 protein, or mixture of HVR1 proteins, is formulated with a suitable adjuvant and administered by injection to human volunteers, according to known methods for producing human antisera. Antibody response to the injected proteins is monitored, during a several-week period following immunization, by periodic serum sampling to detect the presence of anti-HVR1 serum antibodies, using an immunoassay as described herein.

The antiserum from immunized individuals may be administered as a pre-exposure prophylactic measure for individuals who are at risk of contracting infection. The antiserum is also useful in treating an individual post-exposure, analogous to the use of high titer antiserum against hepatitis B virus for post-exposure prophylaxis.

For both in vivo use of antibodies to HVR1 proteins and anti-idiotype antibodies and diagnostic use, it may be preferable to use monoclonal antibodies. Monoclonal anti-HVR1 protein antibodies or anti-idiotype antibodies can be produced as follows. The spleen or lymphocytes from an immunized animal are removed and immortalized or used to prepare hybridomas by methods known to those skilled in the art (Goding, J. W. 1983. Monoclonal Antibodies: Principles and Practice, Academic Press, Inc., New York, N.Y., pp. 56-97). To produce a human-human hybridoma, a human lymphocyte donor is selected. A donor known to be infected with HCV (where infection has been shown for example by the presence of anti-virus antibodies in the blood or by virus culture) may serve as a suitable lymphocyte donor. Lymphocytes can be isolated from a peripheral blood sample or spleen cells may be used if the donor is subject to splenectomy. Epstein-Barr virus (EBV) can be used to immortalize human lymphocytes or a human fusion partner can be used to produce human-human hybridomas. Primary in vitro immunization with peptides can also be used in the generation of human monoclonal antibodies.

Antibodies secreted by the immortalized cells are screened to determine the clones that secrete antibodies of the desired specificity. For monoclonal antibodies to the HVR1 amino acid sequences disclosed herein, the antibodies must bind to HVR1 proteins. For monoclonal anti-idiotype antibodies, the antibodies must bind to anti-HVR1 protein antibodies. Cells producing antibodies of the desired specificity are selected.

The present invention also relates to the use of single-stranded antisense poly- or oligonucleotides derived from nucleotide sequences substantially homologous to those shown in SEQ ID NOs:1-49 to inhibit the expression of hepatitis C E2 genes. By substantially homologous as used throughout the specification and claims to describe the nucleic acid sequences of the present invention, is meant a level of homology between the nucleic acid sequence and the SEQ ID NOs. referred to in the above sentence. Preferably, the level of homology is in excess of 80%, more preferably in excess of 90%, with a preferred nucleic acid sequence being in excess of 95% homologous with the DNA sequence shown in the indicated SEQ ID NO. These anti-sense poly- or oligonucleotides can be either DNA or RNA. The targeted sequence is typically messenger RNA and more preferably, a single sequence required for processing or translation of the RNA. The anti-sense poly- or oligonucleotides can be conjugated to a polycation such as polylysine as disclosed in Lemaitre, M. et al. ((1989) Proc. Natl. Acad. Sci. USA, 84:648-652) and this conjugate can be administered to a mammal in an amount sufficient to hybridize to and inhibit the function of the messenger RNA.
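As an illustration only, the homology thresholds and the antisense relationship described above can be sketched in Python. The specification does not state how the level of homology is to be computed; the sketch assumes ungapped percent identity over two sequences that have already been aligned to equal length. The function names `percent_identity`, `homology_tier` and `antisense_dna` are hypothetical.

```python
RNA_TO_ANTISENSE_DNA = str.maketrans("ACGU", "TGCA")


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences.

    Assumption: the sequences are pre-aligned without gaps.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def homology_tier(identity: float) -> str:
    """Report the highest threshold from the text that the identity exceeds."""
    for threshold in (95, 90, 80):
        if identity > threshold:
            return f"in excess of {threshold}%"
    return "not in excess of 80%"


def antisense_dna(target_rna: str) -> str:
    """Return the DNA oligonucleotide complementary to a target RNA, 5' to 3'."""
    rna = target_rna.upper().replace("T", "U")
    return rna.translate(RNA_TO_ANTISENSE_DNA)[::-1]
```

Under these assumptions, two 10-base sequences differing at one position are 90% identical, which is in excess of 80% but not in excess of 90%.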

Any articles or patents referenced herein are incorporated by reference. The following examples illustrate various aspects of the invention but are in no way intended to limit the scope thereof.

EXAMPLE 1 Use Of HVR1 Protein Or Nucleic Acid Sequence Encoding HVR1 Protein As A Vaccine

Mammals are immunized intradermally or intramuscularly with 2 to 20 μg of at least one HVR1 protein having an amino acid sequence of at least six contiguous amino acids selected from the amino acid sequence shown in SEQ ID NOs:50-98 or with 10 to 1000 μg of expression vector containing at least one nucleic acid having a sequence of at least 15 nucleotides selected from SEQ ID NOs:1-49 to stimulate production of protective antibodies. Those of ordinary skill in the art would readily understand that the HVR1 protein or the expression vector containing HVR1 nucleic acid sequence can be used alone or in combination with other HVR1 proteins or other expression vectors containing different HVR1 nucleic acid sequences presented herein. When HVR1 proteins or nucleic acid sequences from multiple isolates are used as immunogens, the immunized mammals are protected from challenge with multiple isolates of HCV.
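For illustration, the candidate fragments meeting the minimum lengths recited above (at least six contiguous amino acids, or at least 15 nucleotides) can be enumerated with a sliding window. This Python sketch lists only fragments of exactly the minimum length; longer fragments also satisfy the recited minimum. The name `contiguous_fragments` is hypothetical.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every contiguous sub-sequence of the given length, in order.

    An empty list is returned when the sequence is shorter than `length`.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
```

Applied to a 32-residue HVR1 amino acid sequence with `length=6`, this gives 27 fragments; applied to a 96-base sequence with `length=15`, it gives 82.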

EXAMPLE 2 Use Of Antisera To The HVR1 Protein Sequences In Pre- or Post-Exposure Prophylaxis

Antisera collected from a mammal injected with a protein having an amino acid sequence of at least six contiguous amino acids selected from the amino acid sequences shown in SEQ ID NOs:50-98, or a mixture of such proteins, is administered intravenously to an individual post-exposure to HCV or is administered to an uninfected mammal in an amount effective to protect against hepatitis C infection. Such administration is repeated one or more times at monthly intervals and serves to reduce the severity of the HCV infection as indicated by, for example, diminished replication of HCV.

    __________________________________________________________________________    #             SEQUENCE LISTING                                                - (1) GENERAL INFORMATION:                                                    -    (iii) NUMBER OF SEQUENCES:  98                                           - (2) INFORMATION FOR SEQ ID NO:1:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # S18     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                 #     39C TAC GCC ACT GGG GGG AGT GCC AGC AG - #G ACC ACG                     #     78G TTC ACT AGG TTC TTC TCT CCG GGC GC - #C AAG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:2:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # S14     (C) INDIVIDUAL ISOLATE:             
                                -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                 #     39C TAC ATC ACC GGG GGA ACT GCC GGT CG - #C ACC GTG                     #     78A CTC AGC AAT CTC CTC GCA CCG GGC GC - #C AAG CAG                     #  96              TT AAC                                                     - (2) INFORMATION FOR SEQ ID NO:3:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # DK7     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:                                 #     39C CAC GTC ACC GGG GGA ACT GCC GCC CG - #C GCT GCG                     #     78C ATT ACT AGT CTC TTT GCA CCA GGC GC - #C AAA CAG                     #  96              TC AGC                                                     - (2) INFORMATION FOR SEQ ID NO:4:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # US11    (C) 
INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:                                 #     39C TAC GTC ACC GGG GGA AGT GCC GGC CA - #T GCC GCG                     #     78A CTT GCT GGT CTT TTC TCA CAA GGC GC - #C CAG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:5:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SW1     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:                                 #     39C TAC ACC ACC GGG GGG GCT GCT GGT CA - #G ACC GCG                     #     78A TTC ACC AGT CTT TTC ACG CGG GGC GC - #C CAG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:6:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                        
              # DK9     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:                                 #     39C CGC GTC ACC GGG GGG AGC GCT GCC AG - #G AAC ACG                     #     78A CTC GCC AGT CTT CTC AGC CCG GGC GC - #C AAG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:7:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # DR4     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:                                 #     39C CAA GTC AGC GGG GGG AGC GCC GCT CG - #C ACC GTG                     #     78A CTC GCT GGT CTC TTC GAC CAG GGC GC - #G CGG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:8:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - 
#ens                                      # DR1     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:                                 #     39C CAT GTC ACT GGG GGA AGT GAA GCT CG - #C GCC GCG                     #     78A CTC ACT GGT CTC TTC ACG CGG GGC GC - #G CGG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:9:                                            -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  108 bas - #e pairs                                               (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # D3      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:                                 #     39A GGC GTG GGC ACC CAC ACG ATA GGG GG - #G GCG CAA                     #     78C AGC GTT AGG GGG TTC ACG TCC ATA TT - #T TCA ACT                     #          108     AG ATC CAG CTT GTA AAC                                     - (2) INFORMATION FOR SEQ ID NO:10:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  108 bas - #e pairs                                               (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                           
  (A) ORGANISM:  homosapi - #ens                                      # D1      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:                                #     39A TCC CCG GGC ACC CGC ACG ATA GGG GG - #G TCG CAA                     #     78A CAC ACT AGC AGT ATC GTG TCC ATG TT - #C TCA CTT                     #          108     AA ATC CAG CTT GTA AAC                                     - (2) INFORMATION FOR SEQ ID NO:11:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # P10     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:                                #     39C CAC ACG ACG GGG GGG TCG GTG GCC TA - #C GGC ACC                     #     78G TTT ACG TCC CTC TTT ACA TCT GGG GC - #G TCT CAG                     #  96              TG AAC                                                     - (2) INFORMATION FOR SEQ ID NO:12:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                               
                              (A) ORGANISM:  homosapi - #ens                                      # T10     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:                                #     39C CGC GTA ACA GGG GGA ACG GCA GCC CG - #C AAC ACC                     #     78G CTC GCG TCC ATC TTT GCA CCT GGG GC - #G TCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:13:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # HK5     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:                                #     39C CAC GTG ACA GGG GGT ACT GCA GCC CA - #C ACC ACT                     #     78G CTC ACG TCC CTG TTC GCC CCT GGG CC - #T TCT CAG                     #  96              TA AAT                                                     - (2) INFORMATION FOR SEQ ID NO:14:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:   
                                                          (A) ORGANISM:  homosapi - #ens                                      # HK8     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:                                #     39C TAC GTG TCA GGG GGT GCG ACA GCC CG - #C AAC ACT                     #     78G CTT ACG TCC CTC TTC ACC CCA GGG GC - #T GCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:15:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T3      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:                                #     39C CAC GTG TCA GGG GGG GTG TCG GCT CG - #C ACC ACC                     #     78G CTG GCA TCC TTC TTT TCA CCT GGG CC - #G TCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:16:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               - 
    (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SW2     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:                                #     39C TAC ACG ACA GGG GGA GAG GCA GCC TA - #C AAT ACC                     #     78C TTT GCG AGT ATC TTC TCA AGC GGG CC - #G TCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:17:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SA10    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:                                #     39C TAC ACG ACA GGG GGG GCG CAA GGC CG - #C ACC ACC                     #     78C TTC GTG GGT CTC TTC ACC CCT GGG CC - #G TCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:18:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                     
                          -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # US6     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:                                #     39T CAC GTG ACG GGG GGG GCG CAA GCC TA - #C GCC GCC                     #     78T TTC ACG TCT CTC TTC ACA CCT GGG TC - #A CGT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:19:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # IND5    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:                                #     39C AAG ACA ATA GGG GGG CGC CAA GCC CA - #C ACC ACC                     #     78C CTT GTG TCT ATG TTC ACC CCT GGG CC - #G TCC CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:20:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY: 
 linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # IND8    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:                                #     39C AAC ATA ATA GGG GGG AGG GAA GCC TC - #C ACC ACC                     #     78C TTT ACG AGT CTT TTC AGC CCT GGA GC - #G TCC CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:21:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # HK3     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:                                #     39C CAC ACG ATA GGG GCA ACT GTG GCC CG - #C ACC ACT                     #     78T TGG ACG GGC TTC TTC AGC TCC GGG CC - #C TCT CAG                     #  96              TA AAT                                                     - (2) INFORMATION FOR SEQ ID NO:22:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                   
              (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # S9      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:                                #     39C ACC GTG ACG GGA GCG GTG CAA GGC CG - #T TCC CTC                     #     78G CTC ACT GGC CTT TTT TCC TCT GGA CC - #G ACT CAG                     #  96              TA AAT                                                     - (2) INFORMATION FOR SEQ ID NO:23:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # HK4     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:                                #     39C TAC GTG ACA GGG GGG GCG GCA AGC CA - #T TCC ACC                     #     78G CTC ACG TCC CTT TTC ACA ACG GGG GC - #G TCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:24:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le       
                                          (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # S45     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:                                #     39C TAC ACG TCG GGG CAG GCG GCG GGC CG - #C ACC ACC                     #     78G TTT ACG TCC ATC TTT AAC CCT GGG TC - #G GCT CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:25:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # DK1     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:                                #     39C CAC GTG ACG GGG GCG GTG CAG GGC CG - #C ACC ACC                     #     78T TTC GCG TCC CTC TTC TCA CCC GGA TC - #G GCC CAG                     #  96              TA AAC                                                     - (2) INFORMATION FOR SEQ ID NO:26:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) 
STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # US10    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:                                #     39C AGG ACG GTT GGG CAT TCT GCA GCG TA - #C ACC GCC                     #     78T TTC GCC GGC ATC TTC AAC GCT GGC TC - #T AGG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:27:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T4      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:                                #     39C ACC ACC ATT GGG AGT GCT GTC GCG AG - #C ACC ACC                     #     78C CTC ACC GGC TTG TTC TCC CCA GGC TC - #T CAG CAG                     #  96              TT AAC                                                     - (2) INFORMATION FOR SEQ ID NO:28:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                           
                        (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T9      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:                                #     39C CAT ACA TCT GGG GGC ACC GCC GGG CA - #T ACA GCC                     #     78C CTC ACC AGC ATC TTC AGC CCT GGC GC - #C CGG CAG                     #  96              TT TAT                                                     - (2) INFORMATION FOR SEQ ID NO:29:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T2      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:                                #     39C GAG CTC ACC GGG AGT AAT GCC GGG CG - #T ACC ACC                     #     78C CTC GCT GCC TTC TTC ACC CCT GGC GC - #T AGC CAG                     #  96              TT AAC                                                     - (2) INFORMATION FOR SEQ ID NO:30:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - 
#cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T8      (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:                                #     39C TAT ACT ACC GGC GCA CAA GTG GCT CG - #T ACC ACT                     #     78T CTT GCC GGC CTC TTC ACC ACC GGT CC - #T CAG CAG                     #  96              TC AAT                                                     - (2) INFORMATION FOR SEQ ID NO:31:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                                (B) TYPE:  nucleic a - #cid                                                   (C) STRANDEDNESS:  sing - #le                                                 (D) TOPOLOGY:  linear                                               -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # DK8     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:                                #     39T TAT ACC ACC GGC GGA CAA GCG GCT AG - #G GAC ACC                     #     78G CTT GCT CGC CTC TTC TCC CCT GGC GC - #C CAG CAG                     #  96              TC AAC                                                     - (2) INFORMATION FOR SEQ ID NO:32:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  96 base - # pairs                                           
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK11

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

C CGT GTC ACC GGC GCG ATC GCG GGT CGG ACC GCC      39
G CTT GCT AGC CTC TTT AAC TCT GGC CCC CAG CAG      78
TC AAC      96

(2) INFORMATION FOR SEQ ID NO:33:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S83

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

T TAT ACC ACT GGA GCA TCT GCT GGA CAG CAG GTA      39
C TTC GCC AGA CTC TTC AGT CCG GGG CCC AAC CAG      78
TC CGC      96

(2) INFORMATION FOR SEQ ID NO:34:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK10

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

A TAT ATC AGT GGT GGC CAC GTG GCT CGT GGT GCC      39
G CTC GCC AGC TTT TTT TCT CCG GGC GCC AAA CAG      78
TC AAT      96

(2) INFORMATION FOR SEQ ID NO:35:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S2

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

A TAT GTC ACC GGT GGC AGT GCA GCT CGT AGT GCT      39
G CTA GCT AGC TTC TTT TCT CCG GGC GCC CAG CAG      78
TT AAC      96

(2) INFORMATION FOR SEQ ID NO:36:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S52

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

A TAT GTC ACC GGT GGC AGT GTA GCT CAT AGT GCT      39
G TTA ACT AGC CTT TTT AGT ATG GGC GCC AAG CAG      78
TC AAC      96

(2) INFORMATION FOR SEQ ID NO:37:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S54

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

A TAT ACC ACC GGT GGC AGT GCA GCT CAT AGT GCC      39
G ATA ACT CGC CTT TTT AGT GTG GGC GCC AAA CAG      78
TC AAC      96

(2) INFORMATION FOR SEQ ID NO:38:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK12

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

A CAC GTC ACC GGT GGC GAT GCA GCT CGT AGT ACC      39
G TTT ACT AGC CTT TTT AGT GTG GGC TCC AAC CAG      78
TC AAC      96

(2) INFORMATION FOR SEQ ID NO:39:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z4

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

A TCT GTC AGC GGG GGC ACT CAG GCC CGA GCA GCC      39
G TTG ACC AGC CTC TTT ACA TCT GGG CCC AGA CAA      78
TA AAT      96

(2) INFORMATION FOR SEQ ID NO:40:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z1

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

G TAC GCT TCT GGC GCT GCG GCC GGC CGA ACC ACC      39
C TTT GCC GGC CTA TTT ACC CCT GGC GCC AAG CAG      78
TC AAC      96

(2) INFORMATION FOR SEQ ID NO:41:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z7

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

C ATG ACA ACC GGG GGA GCT GCT GCC CGC ACT GCC      39
C TTC ACC GGC CTT TTC ACT TCT GGG CCC CAG CAA      78
TT AAC      96

(2) INFORMATION FOR SEQ ID NO:42:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z6

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

C GTG ACA ACT GGG GGA AGC GTT GCT CGC AGC ACC      39
C ATT ACT AGC CTC TTC AAT TCT GGG CCT AAG CAG      78
TT AAT      96

(2) INFORMATION FOR SEQ ID NO:43:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK13

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

C TAC GTC ACC GGG GGC CAG GCG GGA CAG ACC GCG      39
C CTT ACC GGA CTG TTC ACC AGG GGT TCC CAC CAG      78
TT AAC      96

(2) INFORMATION FOR SEQ ID NO:44:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA6

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

C CAC AGT GTG GGG GGC TCT GCA GCT CAT ACT ACG      39
C TTT ACC TCA CTT TTC AAC CCC GGG CCG AAG CAG      78
TA TAC      96

(2) INFORMATION FOR SEQ ID NO:45:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA1

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

C CAC ACC GTG GCC GGT ACC GCT GCT TAC AGT ACG      39
C TTT GCC TCG ATT TTC ACC CCC GGG CCA AAG CAG      78
TA AAT      96

(2) INFORMATION FOR SEQ ID NO:46:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA13

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

C CGC ACT GTG GGT GGT AGT GCG GCC CAA GGC GCG      39
G CTC GCT TCA CTT TTC ACC CCT GGG CCG CAG CAG      78
TA AAT      96

(2) INFORMATION FOR SEQ ID NO:47:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA4

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

C CAC ATT TCG GGC GGT ACT GCT GCT AAA ACT GTG      39
T TTT ACT TCA CTT TTC TCC TTC GGG GCA CAG CAG      78
TA AAT      96

(2) INFORMATION FOR SEQ ID NO:48:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 96 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA7

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

T CAC GTT GTG GGC GGT GCC GCT GCT CGT AGT GCG      39
C ATG GCC TCA CTC TTT ACT GTC GGG GCA AAG CAG      78
TA AAT      96

(2) INFORMATION FOR SEQ ID NO:49:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 93 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK2

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

C ACC ACC GGC CAC GCA GTG GGC CGC ACA ACC TCC      39
T GCC GGG CTT TTC TCC CCC GGT GCC AAG CAA AAT      78
AC      93

(2) INFORMATION FOR SEQ ID NO:50:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S18

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

Asp Thr Tyr Ala Thr Gly Gly Ser Ala Ser Arg Thr
                                    10
Thr Gln Ala Phe Thr Arg Phe Phe Ser Pro Gly Ala
                            20
Lys Gln Asp Ile Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:51:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S14

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

Asp Thr Tyr Ile Thr Gly Gly Thr Ala Gly Arg Thr
                                    10
Val Gly Thr Leu Ser Asn Leu Leu Ala Pro Gly Ala
                            20
Lys Gln Asn Ile Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:52:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK7

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

Ser Thr His Val Thr Gly Gly Thr Ala Ala Arg Ala
                                    10
Ala Phe Gly Ile Thr Ser Leu Phe Ala Pro Gly Ala
                            20
Lys Gln Asn Ile Gln Leu Ile Ser
                    30

(2) INFORMATION FOR SEQ ID NO:53:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: US11

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

Glu Thr Tyr Val Thr Gly Gly Ser Ala Gly His Ala
                                    10
Ala Ser Gly Leu Ala Gly Leu Phe Ser Gln Gly Ala
                            20
Gln Gln Asn Ile Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:54:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SW1

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

Glu Thr Tyr Thr Thr Gly Gly Ala Ala Gly Gln Thr
                                    10
Ala Ser Gly Phe Thr Ser Leu Phe Thr Arg Gly Ala
                            20
Gln Gln Asn Ile Gln Leu Val Asn
                    30

(2) INFORMATION FOR SEQ ID NO:55:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK9

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

Asp Thr Arg Val Thr Gly Gly Ser Ala Ala Arg Asn
                                    10
Thr Tyr Gly Leu Ala Ser Leu Leu Ser Pro Gly Ala
                            20
Lys Gln Asn Ile Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:56:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DR4

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

Gly Thr Gln Val Ser Gly Gly Ser Ala Ala Arg Thr
                                    10
Val Asn Ala Leu Ala Gly Leu Phe Asp Gln Gly Ala
                            20
Arg Gln Asn Ile Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:57:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DR1

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

Thr Thr His Val Thr Gly Gly Ser Glu Ala Arg Ala
                                    10
Ala Ser Ala Leu Thr Gly Leu Phe Thr Arg Gly Ala
                            20
Arg Gln Asn Val Gln Leu Ile Asn
                    30

(2) INFORMATION FOR SEQ ID NO:58:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 36 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: D3

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

Arg Gly Gly Val Gly Thr His Thr Ile Gly Gly Ala
                                    10
Gln Ala Tyr Ser Val Arg Gly Phe Thr Ser Ile Phe
                            20
Ser Thr Gly Pro Ala Gln Lys Ile Gln Leu Val Asn
                                        35

(2) INFORMATION FOR SEQ ID NO:59:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 36 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: D1

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

Ser Ala Ser Pro Gly Thr Arg Thr Ile Gly Gly Ser
                                    10
Gln Ala Lys His Thr Ser Ser Ile Val Ser Met Phe
                            20
Ser Leu Gly Pro Ser Gln Lys Ile Gln Leu Val Asn
                                        35

(2) INFORMATION FOR SEQ ID NO:60:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: P10

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

Arg Thr His Thr Thr Gly Gly Ser Val Ala Tyr Gly
                                    10
Thr Arg Arg Phe Thr Ser Leu Phe Thr Ser Gly Ala
                            20
Ser Gln Lys Ile Gln Leu Val Asn
                    30

(2) INFORMATION FOR SEQ ID NO:61:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
     (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # T10     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:                                - Ser Thr Arg Val Thr Gly Gly Thr Ala Ala Ar - #g Asn                         #                 10                                                          - Thr Tyr Gly Leu Ala Ser Ile Phe Ala Pro Gl - #y Ala                         #         20                                                                  - Ser Gln Lys Ile Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:62:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # HK5     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:                                - Ala Thr His Val Thr Gly Gly Thr Ala Ala Hi - #s Thr                         #                 10                                                          - Thr Arg Gly Leu Thr Ser Leu Phe Ala Pro Gl - #y Pro  
                                      20
          Ser Gln Lys Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:63:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK8
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
          Asp Thr Tyr Val Ser Gly Gly Ala Thr Ala Arg Asn
                                              10
          Thr Tyr Gly Leu Thr Ser Leu Phe Thr Pro Gly Ala
                                      20
          Ala Gln Lys Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:64:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: T3
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
          Thr Thr His Val Ser Gly Gly Val Ser Ala Arg Thr
                                              10
          Thr His Gly Leu Ala Ser Phe Phe Ser Pro Gly Pro
                                      20
          Ser Gln Lys Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:65:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SW2
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
          Asn Thr Tyr Thr Thr Gly Gly Glu Ala Ala Tyr Asn
                                              10
          Thr Arg Gly Phe Ala Ser Ile Phe Ser Ser Gly Pro
                                      20
          Ser Gln Lys Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:66:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: SA10
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
          Gly Thr Tyr Thr Thr Gly Gly Ala Gln Gly Arg Thr
                                              10
          Thr Ser Ser Phe Val Gly Leu Phe Thr Pro Gly Pro
                                      20
          Ser Gln Arg Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:67:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: US6
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
          Glu Thr His Val Thr Gly Gly Ala Gln Ala Tyr Ala
                                              10
          Ala Arg Ser Phe Thr Ser Leu Phe Thr Pro Gly Ser
                                      20
          Arg Gln Asn Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:68:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: IND5
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
          Gln Ala Lys Thr Ile Gly Gly Arg Gln Ala His Thr
                                              10
          Thr Gly Arg Leu Val Ser Met Phe Thr Pro Gly Pro
                                      20
          Ser Gln Asn Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:69:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: IND8
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
          His Thr Asn Ile Ile Gly Gly Arg Glu Ala Ser Thr
                                              10
          Thr Gln Gly Phe Thr Ser Leu Phe Ser Pro Gly Ala
                                      20
          Ser Gln Lys Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:70:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK3
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
          Ser Thr His Thr Ile Gly Ala Thr Val Ala Arg Thr
                                              10
          Thr Gln Ser Trp Thr Gly Phe Phe Ser Ser Gly Pro
                                      20
          Ser Gln Lys Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:71:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S9
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
          Gly Thr Thr Val Thr Gly Ala Val Gln Gly Arg Ser
                                              10
          Leu Gln Gly Leu Thr Gly Leu Phe Ser Ser Gly Pro
                                      20
          Thr Gln Lys Leu Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:72:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK4
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
          Asn Thr Tyr Val Thr Gly Gly Ala Ala Ser His Ser
                                              10
          Thr Arg Gly Leu Thr Ser Leu Phe Thr Thr Gly Ala
                                      20
          Ser Gln Lys Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:73:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S45
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
          Gly Thr Tyr Thr Ser Gly Gln Ala Ala Gly Arg Thr
                                              10
          Thr Ala Gly Phe Thr Ser Ile Phe Asn Pro Gly Ser
                                      20
          Ala Gln Ser Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:74:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK1
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
          Thr Thr His Val Thr Gly Ala Val Gln Gly Arg Thr
                                              10
          Thr Gln Gly Phe Ala Ser Leu Phe Ser Pro Gly Ser
                                      20
          Ala Gln Lys Ile Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:75:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: US10
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
          Ala Thr Arg Thr Val Gly His Ser Ala Ala Tyr Thr
                                              10
          Ala Ser Thr Phe Ala Gly Ile Phe Asn Ala Gly Ser
                                      20
          Arg Gln Asn Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:76:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: T4
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
          Ser Ser Thr Thr Ile Gly Ser Ala Val Ala Ser Thr
                                              10
          Thr Arg Gly Leu Thr Gly Leu Phe Ser Pro Gly Ser
                                      20
          Gln Gln Asn Ile Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:77:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: T9
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
          Thr Thr His Thr Ser Gly Gly Thr Ala Gly His Thr
                                              10
          Ala Tyr Gly Leu Thr Ser Ile Phe Ser Pro Gly Ala
                                      20
          Arg Gln Lys Ile Gln Leu Ile Tyr
                              30
(2) INFORMATION FOR SEQ ID NO:78:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: T2
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
          His Thr Glu Leu Thr Gly Ser Asn Ala Gly Arg Thr
                                              10
          Thr Gln Gly Leu Ala Ala Phe Phe Thr Pro Gly Ala
                                      20
          Ser Gln Arg Val Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:79:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: T8
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
          Thr Thr Tyr Thr Thr Gly Ala Gln Val Ala Arg Thr
                                              10
          Thr Ala Ser Leu Ala Gly Leu Phe Thr Thr Gly Pro
                                      20
          Gln Gln Lys Ile Asn Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:80:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK8
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
          Ala Thr Tyr Thr Thr Gly Gly Gln Ala Ala Arg Asp
                                              10
          Thr Trp Gly Leu Ala Arg Leu Phe Ser Pro Gly Ala
                                      20
          Gln Gln Lys Leu Ser Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:81:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK11
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
          Asn Thr Arg Val Thr Gly Ala Ile Ala Gly Arg Thr
                                              10
          Ala Ala Ser Leu Ala Ser Leu Phe Asn Ser Gly Pro
                                      20
          Gln Gln Lys Ile Asn Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:82:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S83
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
          Thr Thr Tyr Thr Thr Gly Ala Ser Ala Gly Gln Gln
                                              10
          Val Gln Ser Phe Ala Arg Leu Phe Ser Pro Gly Pro
                                      20
          Asn Gln His Val Gln Leu Val Arg
                              30
(2) INFORMATION FOR SEQ ID NO:83:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: HK10
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
          Gly Thr Tyr Ile Ser Gly Gly His Val Ala Arg Gly
                                              10
          Ala Ser Gly Leu Ala Ser Phe Phe Ser Pro Gly Ala
                                      20
          Lys Gln Asn Leu Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:84:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S2
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
          Glu Thr Tyr Val Thr Gly Gly Ser Ala Ala Arg Ser
                                              10
          Ala Ser Arg Leu Ala Ser Phe Phe Ser Pro Gly Ala
                                      20
          Gln Gln Lys Leu Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:85:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S52
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
          Glu Thr Tyr Val Thr Gly Gly Ser Val Ala His Ser
                                              10
          Ala Arg Gly Leu Thr Ser Leu Phe Ser Met Gly Ala
                                      20
          Lys Gln Lys Leu Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:86:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: S54
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
          Ala Thr Tyr Thr Thr Gly Gly Ser Ala Ala His Ser
                                              10
          Ala Gln Gly Ile Thr Arg Leu Phe Ser Val Gly Ala
                                      20
          Lys Gln Asn Leu Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:87:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: DK12
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
          Thr Thr His Val Thr Gly Gly Asp Ala Ala Arg Ser
                                              10
          Thr Leu Arg Phe Thr Ser Leu Phe Ser Val Gly Ser
                                      20
          Asn Gln Gln Leu Gln Leu Val Asn
                              30
(2) INFORMATION FOR SEQ ID NO:88:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z4
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
          His Thr Ser Val Ser Gly Gly Thr Gln Ala Arg Ala
                                              10
          Ala Gln Gly Leu Thr Ser Leu Phe Thr Ser Gly Pro
                                      20
          Arg Gln Asn Leu Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:89:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z1
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
          Thr Thr Tyr Ala Ser Gly Ala Ala Ala Gly Arg Thr
                                              10
          Thr Ser Gly Phe Ala Gly Leu Phe Thr Pro Gly Ala
                                      20
          Lys Gln Asn Ile Arg Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:90:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z7
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
          Thr Thr Met Thr Thr Gly Gly Ala Ala Ala Arg Thr
                                              10
          Ala His Ala Phe Thr Gly Leu Phe Thr Ser Gly Pro
                                      20
          Gln Gln Lys Leu Gln Leu Ile Asn
                              30
(2) INFORMATION FOR SEQ ID NO:91:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 32 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: unknown
          (D) TOPOLOGY: unknown
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: homosapiens
          (C) INDIVIDUAL ISOLATE: Z6
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
          - Glu Thr Val Thr Thr Gly Gly Ser Val Ala Ar - #g Ser                         #                 10                                                          - Thr Arg Ala Ile Thr Ser Leu Phe Asn Ser Gl - #y Pro                         #         20                                                                  - Lys Gln Asn Leu Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:92:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # DK13    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:                                - Gly Thr Tyr Val Thr Gly Gly Gln Ala Gly Gl - #n Thr                         #                 10                                                          - Ala Phe His Leu Thr Gly Leu Phe Thr Arg Gl - #y Ser                         #         20                                                                  - His Gln Asn Ile Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:93:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d     
                                                (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SA6     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:                                - Ser Thr His Ser Val Gly Gly Ser Ala Ala Hi - #s Thr                         #                 10                                                          - Thr Ser Gly Phe Thr Ser Leu Phe Asn Pro Gl - #y Pro                         #         20                                                                  - Lys Gln Asn Leu Gln Leu Ile Tyr                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:94:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SA1     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:                                - Arg Thr His Thr Val Ala Gly Thr Ala Ala Ty - #r Ser                         #                 10                                                          - Thr Arg Gly Phe Ala Ser Ile Phe Thr Pro Gl - #y Pro                         #         
20                                                                  - Lys Gln Asn Leu Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:95:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SA13    (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:                                - Asn Thr Arg Thr Val Gly Gly Ser Ala Ala Gl - #n Gly                         #                 10                                                          - Ala Arg Gly Leu Ala Ser Leu Phe Thr Pro Gl - #y Pro                         #         20                                                                  - Gln Gln Asn Leu Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:96:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                 
                            (A) ORGANISM:  homosapi - #ens                                      # SA4     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:                                - Asn Thr His Ile Ser Gly Gly Thr Ala Ala Ly - #s Thr                         #                 10                                                          - Val Gln Gly Phe Thr Ser Leu Phe Ser Phe Gl - #y Ala                         #         20                                                                  - Gln Gln Asn Leu Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID NO:97:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  32 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # SA7     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:                                - Asn Thr His Val Val Gly Gly Ala Ala Ala Ar - #g Ser                         #                 10                                                          - Ala Ser Gly Met Ala Ser Leu Phe Thr Val Gl - #y Ala                         #         20                                                                  - Lys Gln Asn Leu Gln Leu Ile Asn                                             # 30                                                                          - (2) INFORMATION FOR SEQ ID 
NO:98:                                           -      (i) SEQUENCE CHARACTERISTICS:                                                    (A) LENGTH:  31 amin - #o acids                                               (B) TYPE:  amino aci - #d                                                     (C) STRANDEDNESS:  unkn - #own                                                (D) TOPOLOGY:  unknown                                              -     (vi) ORIGINAL SOURCE:                                                             (A) ORGANISM:  homosapi - #ens                                      # HK2     (C) INDIVIDUAL ISOLATE:                                             -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:                                - Thr Thr Thr Thr Gly His Ala Val Gly Arg Th - #r Thr                         #                 10                                                          - Ser Ser Leu Ala Gly Leu Phe Ser Pro Gly Al - #a Lys                         #         20                                                                  - Gln Asn Leu Gln Leu Ile Asn                                                 # 30                                                                          __________________________________________________________________________

We claim:
 1. A purified and isolated protein having an amino acid sequence selected from the group consisting of SEQ ID NO:50 through SEQ ID NO:98.
 2. A method of preventing hepatitis C, comprising administering the composition of claim 1 to a mammal in an amount effective to stimulate the production of protective antibody.
 3. A vaccine for immunizing a mammal against hepatitis C comprising at least one protein according to claim 1 in a pharmacologically acceptable carrier.
 4. A protein produced by a host organism transformed or transfected with a recombinant expression vector comprising a nucleic acid sequence which encodes an amino acid sequence selected from the group consisting of SEQ ID NO:50 through SEQ ID NO:98.
 5. A vaccine for immunizing a mammal against hepatitis C comprising at least one protein according to claim 4.
 6. A composition comprising at least one protein of claim 1 and an excipient, diluent or carrier.
 7. A purified and isolated amino acid molecule having a sequence consisting of from six to thirty two amino acids, where said sequence is a contiguous sequence found in a sequence selected from the group consisting of SEQ ID NO:50 through SEQ ID NO:57 and SEQ ID NO:60 through SEQ ID NO:98.
 8. A composition comprising at least one molecule of claim 7 and an excipient, diluent or carrier.
 9. A method of preventing hepatitis C, said method comprising administering the composition of claim 8 to a mammal in an amount effective to stimulate the production of protective antibody.
 10. A vaccine for immunizing a mammal against hepatitis C, said vaccine comprising at least one molecule of claim 7 in a pharmacologically acceptable carrier.
 11. A protein produced by a host organism transformed or transfected with a recombinant expression vector comprising a nucleic acid sequence which encodes an amino acid sequence consisting of from six to thirty two amino acids, where said sequence is a contiguous sequence found in a sequence selected from the group consisting of SEQ ID NO:50 through SEQ ID NO:57 and SEQ ID NO:60 through SEQ ID NO:98.
 12. A purified and isolated amino acid molecule having a sequence consisting of from six to thirty two amino acids, where said sequence is a contiguous sequence found in a sequence selected from the group consisting of SEQ ID NO:58 and SEQ ID NO:59.
 13. A composition comprising at least one molecule of claim 12 and an excipient, diluent or carrier.
 14. A method of preventing hepatitis C, said method comprising administering the composition of claim 13 to a mammal in an amount effective to stimulate the production of protective antibody.
 15. A vaccine for immunizing a mammal against hepatitis C, said vaccine comprising at least one molecule of claim 12 in a pharmacologically acceptable carrier.
 16. A protein produced by a host organism transformed or transfected with a recombinant expression vector comprising a nucleic acid sequence which encodes an amino acid sequence consisting of from six to thirty two amino acids, where said sequence is a contiguous sequence found in a sequence selected from the group consisting of SEQ ID NO:50 through SEQ ID NO:98.
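
Several of the claims above recite a sequence "consisting of from six to thirty two amino acids" that is "a contiguous sequence found in" a listed sequence. The following is a minimal illustrative sketch of that wording, assuming it can be read as a sliding window over the residues of one listed sequence. The function names `contiguous_fragments` and `is_contiguous_fragment` are hypothetical and do not appear in the specification; only the residues of SEQ ID NO:87 are taken from the sequence listing. The sketch illustrates the arithmetic of the wording and is not a construction of the claims.

```python
# SEQ ID NO:87 (isolate DK12), copied from the sequence listing.
SEQ_ID_NO_87 = (
    "Thr Thr His Val Thr Gly Gly Asp Ala Ala Arg Ser "
    "Thr Leu Arg Phe Thr Ser Leu Phe Ser Val Gly Ser "
    "Asn Gln Gln Leu Gln Leu Val Asn"
)


def contiguous_fragments(sequence, min_len=6, max_len=32):
    """Hypothetical helper: every window of min_len..max_len residues.

    `sequence` is a string of three-letter residue codes separated by spaces.
    Windows are returned as tuples, one per (length, start position) pair.
    """
    residues = sequence.split()
    longest = min(max_len, len(residues))
    fragments = []
    for length in range(min_len, longest + 1):
        for start in range(len(residues) - length + 1):
            fragments.append(tuple(residues[start:start + length]))
    return fragments


def is_contiguous_fragment(candidate, sequence, min_len=6, max_len=32):
    """Hypothetical helper: True if `candidate` is a window of `sequence`
    whose length lies within min_len..max_len residues."""
    wanted = tuple(candidate.split())
    if not min_len <= len(wanted) <= max_len:
        return False
    residues = sequence.split()
    return any(
        tuple(residues[i:i + len(wanted)]) == wanted
        for i in range(len(residues) - len(wanted) + 1)
    )


if __name__ == "__main__":
    print(len(contiguous_fragments(SEQ_ID_NO_87)))
    print(is_contiguous_fragment("Gly Gly Asp Ala Ala Arg", SEQ_ID_NO_87))
```

For a 32-residue sequence, a window of length k can start at 32 - k + 1 positions, so lengths six through thirty-two give 27 + 26 + ... + 1 = 378 windows under this reading. The count is by position, so two windows with identical residues are counted separately.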